AITopics | generative flow

Unsupervised learning of probabilistic models is a central yet challenging problem. Deep generative models have shown promising results in modeling complex distributions such as natural images (Radford et al., 2015), audio (Van Den Oord et al., 2016) and text (Bowman et al., 2015).

artificial intelligence, generative model, machine learning, (20 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.15)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)


Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search

Neural Information Processing Systems | Dec-24-2025, 02:07:40 GMT

Recently, text-to-speech (TTS) models such as FastSpeech and ParaNet have been proposed to generate mel-spectrograms from text in parallel. Despite the advantage, the parallel TTS models cannot be trained without guidance from autoregressive TTS models as their external aligners. In this work, we propose Glow-TTS, a flow-based generative model for parallel TTS that does not require any external aligner. By combining the properties of flows and dynamic programming, the proposed model searches for the most probable monotonic alignment between text and the latent representation of speech on its own. We demonstrate that enforcing hard monotonic alignments enables robust TTS, which generalizes to long utterances, and employing generative flows enables fast, diverse, and controllable speech synthesis.
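The search for the most probable monotonic alignment can be illustrated with a small dynamic program. This is a minimal sketch, not the authors' implementation: it assumes a precomputed matrix `log_lik[i][j]` holding the log-likelihood of latent frame `j` under text token `i`, and it assumes that every frame is assigned to exactly one token and that no token is skipped. The function name and the toy numbers are hypothetical.

```python
NEG_INF = float("-inf")


def monotonic_alignment_search(log_lik):
    """Return (alignment, score) for the best monotonic, non-skipping alignment.

    log_lik[i][j] is the (assumed, precomputed) log-likelihood of frame j
    under token i. alignment[j] is the token index assigned to frame j.
    """
    n_tokens = len(log_lik)
    n_frames = len(log_lik[0])
    if n_tokens > n_frames:
        raise ValueError("need at least one frame per token")

    # q[i][j]: best total log-likelihood of aligning frames 0..j to tokens 0..i,
    # with frame j assigned to token i.
    q = [[NEG_INF] * n_frames for _ in range(n_tokens)]
    q[0][0] = log_lik[0][0]
    for j in range(1, n_frames):
        for i in range(min(j + 1, n_tokens)):
            stay = q[i][j - 1]                               # same token as frame j-1
            advance = q[i - 1][j - 1] if i > 0 else NEG_INF  # move to the next token
            q[i][j] = max(stay, advance) + log_lik[i][j]

    # Backtrack from the last token at the last frame.
    alignment = [0] * n_frames
    i = n_tokens - 1
    for j in range(n_frames - 1, -1, -1):
        alignment[j] = i
        if j > 0 and i > 0 and (i == j or q[i - 1][j - 1] > q[i][j - 1]):
            i -= 1
    return alignment, q[n_tokens - 1][n_frames - 1]


# Toy example: 3 tokens, 4 frames (made-up numbers).
toy_log_lik = [
    [0.0, -2.0, -5.0, -5.0],
    [-5.0, -1.0, 0.0, -5.0],
    [-5.0, -5.0, -1.0, 0.0],
]
toy_alignment, toy_score = monotonic_alignment_search(toy_log_lik)
print(toy_alignment, toy_score)
```

The table has one cell per (token, frame) pair, so the search costs time proportional to the number of tokens times the number of frames.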

generative flow, glow-tts, text-to-speech, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Speech > Speech Synthesis (0.64)


Glow: Generative Flow with Invertible 1x1 Convolutions

Neural Information Processing Systems | Nov-20-2025, 23:01:55 GMT

Flow-based generative models are conceptually attractive due to tractability of the exact log-likelihood, tractability of exact latent-variable inference, and parallelizability of both training and synthesis. In this paper we propose Glow, a simple type of generative flow using invertible 1x1 convolution. Using our method we demonstrate a significant improvement in log-likelihood and qualitative sample quality. Perhaps most strikingly, we demonstrate that a generative model optimized towards the plain log-likelihood objective is capable of efficient synthesis of large and subjectively realistic-looking images.
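An invertible 1x1 convolution can be sketched as a single c-by-c matrix applied to the channel vector at every pixel. The following is an illustrative pure-Python sketch, not the Glow code: it assumes an input indexed as `x[row][col][channel]`, and the helper names are hypothetical. Because the same matrix acts on each of the `height * width` pixels, the log-determinant of the Jacobian is `height * width * log|det W|`, which keeps the log-likelihood tractable.

```python
import math


def conv1x1(x, w):
    """Apply the c-by-c matrix w to the channel vector at every pixel."""
    return [[[sum(w[o][k] * px[k] for k in range(len(px))) for o in range(len(w))]
             for px in row] for row in x]


def det_and_inverse(w):
    """Gauss-Jordan elimination with partial pivoting; returns (det, inverse)."""
    n = len(w)
    # Augment w with the identity matrix: [w | I].
    a = [list(map(float, w[r])) + [float(r == c) for c in range(n)] for r in range(n)]
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            raise ValueError("matrix is singular, so the layer is not invertible")
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= a[col][col]
        scale = a[col][col]
        a[col] = [v / scale for v in a[col]]
        for r in range(n):
            if r != col:
                f = a[r][col]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[col])]
    return det, [row[n:] for row in a]


def log_det_jacobian(w, height, width):
    """Log-determinant contribution of the layer for a height-by-width input."""
    det, _ = det_and_inverse(w)
    return height * width * math.log(abs(det))
```

Inverting the layer amounts to calling `conv1x1` again with the inverse matrix returned by `det_and_inverse`.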

convolution, generative flow, name change, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.68)


MaCow: Masked Convolutional Generative Flow

Xuezhe Ma, Xiang Kong, Shanghang Zhang, Eduard Hovy

Neural Information Processing Systems | Oct-9-2025, 13:29:54 GMT

Unsupervised learning of probabilistic models is a central yet challenging problem. Deep generative models have shown promising results in modeling complex distributions such as natural images (Radford et al., 2015), audio (Van Den Oord et al., 2016) and text (Bowman et al., 2015).

artificial intelligence, deep learning, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States (0.14)
North America > Canada (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Supplementary Material of Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search Appendix A

Neural Information Processing Systems | Oct-3-2025, 00:28:10 GMT

The detailed encoder architecture is depicted in Figure 7. We design the grouped 1x1 convolutions to be able to mix channels. Figure 8c shows an example. The decoder gets a mel-spectrogram and squeezes it. Then, the decoder processes it through a number of flow blocks.
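The squeeze step can be pictured as folding neighbouring time frames into the channel dimension. The sketch below is an assumption about what "squeezes" means here, not the supplementary material's code: it takes a mel-spectrogram indexed as `x[channel][frame]`, uses a hypothetical `factor` of 2, and picks one possible channel ordering. The point is only that the operation rearranges values and is therefore exactly invertible.

```python
def squeeze(x, factor=2):
    """x[channel][frame] -> (channels * factor) rows of (frames // factor) values.

    Leftover frames that do not fill a whole group are dropped (an assumption).
    """
    n_frames = len(x[0]) // factor * factor
    out = []
    for offset in range(factor):
        for channel in x:
            # Take every `factor`-th frame, starting at `offset`.
            out.append(channel[offset:n_frames:factor])
    return out


def unsqueeze(y, factor=2):
    """Inverse of squeeze for inputs whose frame count was divisible by factor."""
    n_channels = len(y) // factor
    n_frames = len(y[0]) * factor
    x = [[None] * n_frames for _ in range(n_channels)]
    for offset in range(factor):
        for c in range(n_channels):
            x[c][offset::factor] = y[offset * n_channels + c]
    return x


# Toy example: 2 channels, 4 frames (made-up numbers).
toy_mel = [[1, 2, 3, 4], [5, 6, 7, 8]]
toy_squeezed = squeeze(toy_mel)
print(toy_squeezed)
```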

artificial intelligence, glow-tts, tacotron 2, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.30)


Ergodic Generative Flows

Brunswic, Leo Maxime, Clemente, Mateo, Yang, Rui Heng, Sigal, Adam, Rasouli, Amir, Li, Yinchuan

arXiv.org Artificial Intelligence | May-7-2025

Generative Flow Networks (GFNs) were initially introduced on directed acyclic graphs to sample from an unnormalized distribution density. Recent works have extended the theoretical framework for generative methods allowing more flexibility and enhancing application range. However, many challenges remain in training GFNs in continuous settings and for imitation learning (IL), including intractability of flow-matching loss, limited tests of non-acyclic training, and the need for a separate reward model in imitation learning. The present work proposes a family of generative flows called Ergodic Generative Flows (EGFs) which are used to address the aforementioned issues. First, we leverage ergodicity to build simple generative flows with finitely many globally defined transformations (diffeomorphisms) with universality guarantees and tractable flow-matching loss (FM loss). Second, we introduce a new loss involving cross-entropy coupled to weak flow-matching control, coined KL-weakFM loss. It is designed for IL training without a separate reward model. We evaluate IL-EGFs on toy 2D tasks and real-world datasets from NASA on the sphere, using the KL-weakFM loss. Additionally, we conduct toy 2D reinforcement learning experiments with a target reward, using the FM loss.

artificial intelligence, init, machine learning, (13 more...)

arXiv.org Artificial Intelligence

2505.03561

Country: North America > United States (0.34)

Genre: Research Report (0.82)

Industry: Government > Regional Government > North America Government > United States Government (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)


Review for NeurIPS paper: Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search

Neural Information Processing Systems | Jan-24-2025, 18:18:02 GMT

Weaknesses: I was a little confused about how the grouped 1x1 convolutions interact with the coupling layers. If the standard (half-and-half) partitioning is used for the coupling layers and the grouped 1x1 convolutions never mix channels outside of their group of 4, then half of the channels will never be transformed by any coupling layer. I'm assuming the authors deal with this issue somehow (since the results are good), but I only briefly scanned the code and didn't want to work through all of the index gymnastics. I could see readers being confused by these missing details. Update: In their response, the authors said they will explain more of the details of the grouped 1x1 convolutions in their revised version.
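The reviewer's concern can be made concrete with a toy bookkeeping model. This is a hypothetical sketch, not the Glow-TTS code: it assumes 8 channels, groups of 4, and a coupling layer that transforms only the second half of the channel positions. It tracks which original channels ever have their information sitting in a transformed position. Both group layouts below are illustrative assumptions.

```python
def transformed_channels(n_channels, groups, n_blocks):
    """Return the set of original channels that are ever touched by a coupling layer."""
    # content[p]: original channels whose information sits at position p.
    content = [{c} for c in range(n_channels)]
    transformed = set()
    half = n_channels // 2
    for _ in range(n_blocks):
        # Grouped 1x1 convolution: positions inside one group share their content.
        for group in groups:
            mixed = set().union(*(content[p] for p in group))
            for p in group:
                content[p] = set(mixed)
        # Half-and-half coupling: only positions in the second half are transformed.
        for p in range(half, n_channels):
            transformed |= content[p]
    return transformed


# Groups that stay inside one half: the first half is never transformed.
contiguous = [[0, 1, 2, 3], [4, 5, 6, 7]]
# Groups that take two positions from each half: every channel is reached.
interleaved = [[0, 1, 4, 5], [2, 3, 6, 7]]

print(sorted(transformed_channels(8, contiguous, n_blocks=4)))
print(sorted(transformed_channels(8, interleaved, n_blocks=4)))
```

Under these assumptions, a group layout that spans both halves of the coupling split is enough to avoid the problem the reviewer describes.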

generative flow, monotonic alignment search, vocoder, (7 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Vision > Optical Character Recognition (0.40)
Information Technology > Artificial Intelligence > Speech > Speech Synthesis (0.40)
Information Technology > Artificial Intelligence > Assistive Technologies (0.40)


Review for NeurIPS paper: Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search

Neural Information Processing Systems | Jan-24-2025, 18:17:55 GMT

After rebuttal and discussion, all four reviewers provide very favorable reviews. The reviewers point out a novel methodology, combining flows with dynamic programming (hard monotonic alignment). The paper is therefore accepted for an oral.

generative flow, monotonic alignment search, text-to-speech, (2 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Vision > Optical Character Recognition (0.40)
Information Technology > Artificial Intelligence > Speech > Speech Synthesis (0.40)
Information Technology > Artificial Intelligence > Assistive Technologies (0.40)


Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search

Neural Information Processing Systems | Oct-10-2024, 07:12:02 GMT

Recently, text-to-speech (TTS) models such as FastSpeech and ParaNet have been proposed to generate mel-spectrograms from text in parallel. Despite the advantage, the parallel TTS models cannot be trained without guidance from autoregressive TTS models as their external aligners. In this work, we propose Glow-TTS, a flow-based generative model for parallel TTS that does not require any external aligner. By combining the properties of flows and dynamic programming, the proposed model searches for the most probable monotonic alignment between text and the latent representation of speech on its own. We demonstrate that enforcing hard monotonic alignments enables robust TTS, which generalizes to long utterances, and employing generative flows enables fast, diverse, and controllable speech synthesis.

generative flow, glow-tts, monotonic alignment search, (3 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Synthesis (1.00)
Information Technology > Artificial Intelligence > Vision > Optical Character Recognition (0.66)
Information Technology > Artificial Intelligence > Assistive Technologies (0.66)


Glow: Generative Flow with Invertible 1x1 Convolutions

Neural Information Processing Systems | Oct-8-2024, 19:30:50 GMT

Flow-based generative models are conceptually attractive due to tractability of the exact log-likelihood, tractability of exact latent-variable inference, and parallelizability of both training and synthesis. In this paper we propose Glow, a simple type of generative flow using invertible 1x1 convolution. Using our method we demonstrate a significant improvement in log-likelihood and qualitative sample quality. Perhaps most strikingly, we demonstrate that a generative model optimized towards the plain log-likelihood objective is capable of efficient synthesis of large and subjectively realistic-looking images.

convolution, generative flow, glow, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.82)

